Application of Laplacian Mixture Model to Image and Video Retrieval

Author(s):  
Tahir Amin

In this study, we present a new approach to feature extraction for image and video retrieval. A Laplacian mixture model is proposed to model the peaky distributions of the wavelet coefficients. The proposed method extracts a low-dimensional feature vector, which is very important for the retrieval efficiency of the system in terms of response time. Although the importance of an effective feature set cannot be overemphasized, it is very hard to describe image similarity with only low-level features. Learning from user feedback may enhance the system performance significantly. This approach, known as relevance feedback, is adopted to further improve the efficiency of the system. The system learns from user input in the form of positive and negative examples, and its parameters are modified according to the user's behavior. The parameters of the Laplacian mixture model are used to represent the texture information of the images. The experimental evaluation indicates the high discriminatory power of the proposed features. Traditional measures of distance between two vectors, such as the city-block or Euclidean distance, are linear in nature. The human visual system does not follow this simple linear model. Therefore, a non-linear approach to the distance measure for defining the similarity between two images is also explored in this work. It is observed that non-linear modelling of similarity yields more satisfactory results and increases the retrieval performance by 7.5 per cent. Video is primarily multi-modal, i.e., it contains different media components such as audio, speech, visual information (frames) and captions (text). Traditionally, visual information is used for video indexing and retrieval. The visual contents of videos are very important; however, in some cases visual information is not very helpful for finding clues to events. For example, certain action sequences, such as goal events in a soccer game and explosions in a news video, are easier to identify in the audio domain than in the visual domain. Since the proposed feature extraction scheme is based on the shape of the wavelet coefficient distribution, it can also be applied to analyze the embedded audio contents of the video. We use audio information for indexing video clips. A feedback mechanism is also studied to improve the performance of the system.
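
The idea of describing a peaky coefficient distribution by the parameters of a Laplacian mixture can be illustrated with a small sketch. The abstract does not state the number of components or the fitting procedure, so the code below assumes a two-component, zero-mean mixture fitted by expectation-maximization (EM), and it uses synthetic samples as a stand-in for real wavelet coefficients. The function names `fit_laplacian_mixture` and `sample_laplacian_mixture` are hypothetical and chosen only for this illustration.

```python
import math
import random


def laplace_pdf(x, b):
    """Density of a zero-mean Laplacian with scale b."""
    return math.exp(-abs(x) / b) / (2.0 * b)


def sample_laplacian_mixture(n, weights, scales):
    """Draw n synthetic 'coefficients' (a stand-in for wavelet coefficients)."""
    samples = []
    for _ in range(n):
        k = random.choices(range(len(weights)), weights=weights)[0]
        magnitude = random.expovariate(1.0 / scales[k])  # mean equals scales[k]
        samples.append(magnitude * random.choice((-1.0, 1.0)))
    return samples


def fit_laplacian_mixture(coeffs, n_iter=200):
    """EM for an assumed two-component, zero-mean Laplacian mixture.

    Returns (weights, scales) with components ordered by increasing scale.
    """
    n = len(coeffs)
    mean_abs = max(sum(abs(c) for c in coeffs) / n, 1e-12)
    weights = [0.5, 0.5]
    scales = [0.5 * mean_abs, 2.0 * mean_abs]  # one narrow, one wide component
    for _ in range(n_iter):
        resp_sum = [0.0, 0.0]  # sum of responsibilities per component
        resp_abs = [0.0, 0.0]  # responsibility-weighted sum of |x|
        # E-step: posterior probability of each component for each sample.
        for c in coeffs:
            p = [w * laplace_pdf(c, b) for w, b in zip(weights, scales)]
            total = (p[0] + p[1]) or 1e-300  # guard against underflow
            for k in range(2):
                r = p[k] / total
                resp_sum[k] += r
                resp_abs[k] += r * abs(c)
        # M-step: weighted maximum-likelihood updates.
        weights = [s / n for s in resp_sum]
        scales = [max(a / max(s, 1e-12), 1e-12)
                  for a, s in zip(resp_abs, resp_sum)]
    pairs = sorted(zip(scales, weights))
    return [w for _, w in pairs], [b for b, _ in pairs]


if __name__ == "__main__":
    random.seed(0)
    data = sample_laplacian_mixture(4000, [0.7, 0.3], [0.5, 5.0])
    weights, scales = fit_laplacian_mixture(data)
    print("feature vector:", weights + scales)
```

Under these assumptions, the fitted mixing weights and scales form a four-number descriptor per set of coefficients, which is the sense in which such a model can yield a low-dimensional feature vector.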

2021 ◽  


2021 ◽  
Vol 11 (7) ◽  
pp. 3138
Author(s):  
Mingchi Zhang ◽  
Xuemin Chen ◽  
Wei Li

In this paper, a deep neural network hidden Markov model (DNN-HMM) is proposed to detect the location of pipeline leakage. A long pipeline is divided into several sections, and a leak occurring in a given section is defined as a distinct state of the hidden Markov model (HMM). The hybrid HMM, i.e., DNN-HMM, consists of a deep neural network (DNN) with multiple layers to exploit the non-linear data. The DNN is initialized by using a deep belief network (DBN). The DBN is a pre-trained model built by stacking top-down restricted Boltzmann machines (RBMs) that compute the emission probabilities for the HMM instead of a Gaussian mixture model (GMM). Two comparative studies based on different numbers of states are performed using the Gaussian mixture model-hidden Markov model (GMM-HMM) and the DNN-HMM. The agreement between the detected state sequence and the actual state sequence on the test data is measured by the micro F1 score. When the pipeline is divided into three sections, the micro F1 score approaches 0.94 for the GMM-HMM method and is close to 0.95 for the DNN-HMM method. In the experiment that divides the pipeline into five sections, the micro F1 score for GMM-HMM is 0.69, while it approaches 0.96 with the DNN-HMM method. The results demonstrate that the DNN-HMM can learn a better model of non-linear data and achieve better performance than the GMM-HMM method.
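
As a rough illustration of the evaluation metric named above, the sketch below computes a micro-averaged F1 score between an actual and a detected state sequence. It pools true positives, false positives and false negatives over all states before computing precision and recall. The function name `micro_f1` and the example sequences are hypothetical and are not taken from the paper.

```python
def micro_f1(actual, detected):
    """Micro-averaged F1 between two equal-length state sequences."""
    if len(actual) != len(detected):
        raise ValueError("sequences must have the same length")
    tp = fp = fn = 0
    for state in set(actual) | set(detected):
        for a, d in zip(actual, detected):
            if d == state and a == state:
                tp += 1  # state detected correctly
            elif d == state and a != state:
                fp += 1  # state detected where another state was true
            elif a == state and d != state:
                fn += 1  # true state missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative sequences for a pipeline split into three sections (states 0..2).
    actual = [0, 0, 1, 1, 2, 2]
    detected = [0, 1, 1, 1, 2, 0]
    print(micro_f1(actual, detected))
```

When every time step carries exactly one state, each error counts once as a false positive and once as a false negative, so the micro F1 score coincides with the fraction of correctly detected states.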


Author(s):  
Nandini H. M. ◽  
Chethan H. K. ◽  
Rashmi B. S.

Shot boundary detection (SBD) in videos is one of the most fundamental tasks in content-based video retrieval and analysis. In this context, an efficient approach to detecting abrupt and gradual transitions in videos is presented. The proposed method detects shot boundaries by extracting block-based mean probability binary weight (MPBW) histograms from the normalized Kirsch magnitude frames as an amalgamation of local and global features. Abrupt transitions are detected by computing a distance measure between consecutive MPBW histograms and applying an adaptive threshold. In the subsequent step, the coefficient of mean deviation and the variance are applied as statistical measures to the MPBW histograms to detect gradual transitions in the video. Experiments were conducted on the TRECVID 2001 and 2007 datasets to analyse and validate the proposed method. The experimental results show a significant improvement of the proposed SBD approach over some state-of-the-art algorithms in terms of recall, precision, and F1-score.
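
The abrupt-transition step described above can be sketched as follows. The abstract does not specify the distance measure or how the adaptive threshold is derived, so this example assumes a city-block distance between consecutive histograms and a threshold of mean plus `k` standard deviations of those distances. The per-frame histograms are taken as given inputs rather than computed from Kirsch magnitude frames, and `detect_abrupt_transitions` is a hypothetical name.

```python
from statistics import mean, pstdev


def city_block(h1, h2):
    """City-block (L1) distance between two histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def detect_abrupt_transitions(histograms, k=1.5):
    """Return indices of frames that start a new shot.

    Assumption: the threshold adapts to the video as
    mean + k * standard deviation of the consecutive-frame distances.
    """
    distances = [city_block(histograms[i], histograms[i + 1])
                 for i in range(len(histograms) - 1)]
    if not distances:
        return []
    threshold = mean(distances) + k * pstdev(distances)
    # A distance at position i compares frames i and i + 1,
    # so the new shot starts at frame i + 1.
    return [i + 1 for i, d in enumerate(distances) if d > threshold]


if __name__ == "__main__":
    # Toy two-bin histograms with a sharp content change at frame 3.
    hists = [[10, 0], [10, 0], [9, 1], [0, 10], [0, 10], [1, 9]]
    print(detect_abrupt_transitions(hists))
```

Because the threshold is derived from the distances of the video being analysed, a clip with generally high frame-to-frame variation needs a larger jump to trigger a boundary than a static one.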


2017 ◽  
Vol E100.D (9) ◽  
pp. 2249-2252 ◽  
Author(s):  
Seongkyu MUN ◽  
Minkyu SHIN ◽  
Suwon SHON ◽  
Wooil KIM ◽  
David K. HAN ◽  
...  

2021 ◽  
pp. 1-46
Author(s):  
Donglin Zhu ◽  
Jingbin Cui ◽  
Yan Li ◽  
Zhonghong Wan ◽  
Lei Li

Seismic facies analysis can effectively estimate reservoir properties, and seismic waveform clustering is a useful tool for facies analysis. We developed a deep learning-based clustering approach called the modified deep convolutional embedded clustering with adaptive Gaussian mixture model (AGMM-MDCEC) for seismic waveform clustering. Trainable feature extraction and clustering layers in AGMM-MDCEC are implemented using neural networks. The two independent processes of feature extraction and clustering are fused, such that extracted features are modified simultaneously with the results of clustering. A convolutional autoencoder is used in the algorithm to extract features from seismic data and reduce data redundancy. At the same time, the weights of the clustering network are fine-tuned through iteration to obtain state-of-the-art clustering results. We apply our new classification algorithm to a data volume acquired in western China to map architectural elements of a complex fluvial depositional system. Our proposed method obtains superior results over those provided by traditional K-means, the Gaussian mixture model, and some machine learning methods, and improves the mapping of the extent of the distributary system.

