Language Identification based on Support Vector Machine using GMM Super vectors

This paper proposes a novel approach that combines the power of generative Gaussian mixture models (GMM) and discriminative support vector machines (SVM). The main objective of this paper is to incorporate GMM supervectors into an SVM classifier for the language identification (LID) task. A GMM-based LID system requires a large amount of training data to capture all the variations present in the phonotactic constraints imposed by a language. Gaussian mixture model (GMM)-universal background model (UBM) modeling requires less training data. In a GMM-UBM LID system, a language model is created by maximum a posteriori (MAP) adaptation of the means of the universal background model (UBM). The GMM supervectors are created by concatenating the means of the mixture components adapted from the UBM. These supervectors are then passed to an SVM for classification. In this paper, the performance of the SVM-based GMM-UBM LID system is compared with that of the conventional GMM LID system. The performance analysis shows that the SVM-based GMM-UBM LID system performs better than the GMM-based LID system.
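The supervector construction described in this abstract can be illustrated with a minimal NumPy sketch. It assumes a diagonal-covariance UBM and the common relevance-factor form of MAP mean adaptation; the function names, the relevance factor of 16, and the toy data are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Log density of each frame under each diagonal-covariance component.

    X: (T, D) frames; means, variances: (M, D). Returns a (T, M) array.
    """
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                   + np.sum(np.log(2.0 * np.pi * variances), axis=1))

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """MAP-adapt only the UBM means to the frames X.

    The relevance factor is an assumed value, not one reported in the abstract.
    """
    log_post = np.log(weights) + log_gauss_diag(X, means, variances)
    log_post -= log_post.max(axis=1, keepdims=True)   # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)           # responsibilities, (T, M)
    n = post.sum(axis=0)                              # soft count per component
    first = post.T @ X                                # first-order statistics, (M, D)
    data_mean = first / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + relevance))[:, None]            # adaptation coefficient
    return alpha * data_mean + (1.0 - alpha) * means

def supervector(adapted_means):
    """Concatenate the adapted component means into one vector of length M * D."""
    return adapted_means.reshape(-1)

# Toy UBM with M = 2 components in D = 2 dimensions (hypothetical values).
ubm_weights = np.array([0.5, 0.5])
ubm_means = np.array([[0.0, 0.0], [10.0, 10.0]])
ubm_variances = np.ones((2, 2))

rng = np.random.default_rng(0)
frames = rng.normal(loc=1.0, scale=0.5, size=(200, 2))  # stand-in for one utterance
sv = supervector(map_adapt_means(frames, ubm_weights, ubm_means, ubm_variances))
print(sv.shape)
```

In this toy case the first component moves toward the data while the second, which sees almost no frames, stays at its UBM mean. One supervector per utterance would then serve as an input vector to an SVM; that classification step is not shown here.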

Most existing LID systems are based on the Gaussian mixture model (GMM). The main requirement of a GMM-based LID system is a large amount of speech data to train the GMM. Most Indian languages are similar because they are derived from Devanagari. Even though common phonemes exist in the phoneme sets across Indian languages, each language has its own phonotactic constraints. These slight variations are an important language identification cue, so a modeling technique should be capable of capturing them. A GMM-based LID system that captures these variations requires a large number of mixture components. Modeling a large number of mixture components with a GMM requires a large amount of training data for each language class, which is very difficult to obtain for Indian languages. The main advantage of a GMM-UBM based LID system is that it requires less training data to train (model) the system. In this paper, the importance of GMM-UBM modeling for the language identification (LID) task for Indian languages is explored using a new set of feature vectors. In the GMM-UBM LID system based on the new feature vectors, the phonotactic variations imparted by different Indian languages are modeled using the Gaussian mixture model and universal background model (GMM-UBM) technique. In this type of modeling, some data from each language class is pooled to create a universal background model. Each class model is then adapted from this UBM. In this study, it is found that the GMM-UBM based LID system using the new feature vectors outperforms the conventional GMM LID system using the same feature vectors.


2016 ◽  
Vol 9 (1) ◽  
pp. 36-40
Author(s):  
Renu Singh ◽  
Arvind Singh ◽  
Utpal Bhattacharjee

This paper presents a review of various speaker verification approaches under real-world conditions and explores two combined approaches: Gaussian Mixture Model (GMM) with Support Vector Machine (SVM), and Gaussian Mixture Model (GMM) with Universal Background Model (UBM).


2020 ◽  
Vol 29 (54) ◽  
pp. e11752
Author(s):  
Néstor David Rendón-Hurtado ◽  
Claudia Victoria Isaza-Narváez ◽  
Susana Rodríguez-Buriticá

Today, machine learning methods have become a tool for helping to curb the effects of global warming by addressing ecological questions. In particular, the tropical dry forest (TDF) of Colombia is currently threatened by deforestation caused, since colonial times, by cattle ranching, mining, and urban development. One of the urgent challenges in this area is understanding the transformation and degradation of forests. Traditionally, ecosystem changes are measured by several levels of transformation (high, medium, low). These are obtained through direct observation, species counts, and measurements of spatial variation over time. Consequently, these methods are invasive and require long periods of observation at the study sites. An effective alternative to the classical methods is passive acoustic monitoring, which is less invasive because it avoids isolating species and reduces the time researchers spend on site. However, it generates large amounts of data and requires computational tools for analyzing the recordings. This work proposes a method for automatically identifying TDF transformation from acoustic recordings by applying two classification models: Gaussian Mixture Models (GMM), one for each studied region, and a Universal Background Model (UBM) as a general model. It also includes an analysis of acoustic indices in order to identify those most representative of TDF transformations. Our GMM proposal reached an accuracy of 93% and 89% for the La Guajira and Bolívar regions, respectively. The general UBM model achieved 84% accuracy.


Informatics ◽  
2018 ◽  
Vol 5 (3) ◽  
pp. 38 ◽  
Author(s):  
Martin Jänicke ◽  
Bernhard Sick ◽  
Sven Tomforde

Personal wearables such as smartphones or smartwatches are increasingly utilized in everyday life. Frequently, activity recognition is performed on these devices to estimate the current user status and trigger automated actions according to the user’s needs. In this article, we focus on the creation of a self-adaptive activity recognition system based on IMU data that includes new sensors during runtime. Starting with a classifier based on GMM, the density model is adapted to new sensor data fully autonomously by using the marginalization property of normal distributions. To create a classifier from that, label inference is done, either based on the initial classifier or based on the training data. For evaluation, we used more than 10 h of annotated activity data from the publicly available PAMAP2 benchmark dataset. Using the data, we showed the feasibility of our approach and performed 9720 experiments to obtain robust figures. One approach performed reasonably well, leading to a system improvement on average, with an increase in the F-score of 0.0053, while the other one shows clear drawbacks due to a high loss of information during label inference. Furthermore, a comparison with state-of-the-art techniques shows the necessity for further experiments in this area.
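The marginalization property of normal distributions mentioned in this abstract can be shown in a few lines. The sketch below illustrates only the property itself (the marginal of a multivariate normal over a subset of dimensions is obtained by selecting the matching entries of the mean and covariance); it is not the article's adaptation procedure, and the function name and numbers are illustrative assumptions.

```python
import numpy as np

def marginalize_gaussian(mean, cov, keep):
    """Return the mean and covariance of the marginal over the dimensions in `keep`."""
    keep = np.asarray(keep)
    return mean[keep], cov[np.ix_(keep, keep)]

# Hypothetical 3-dimensional component, e.g. three sensor channels.
mean = np.array([1.0, 2.0, 3.0])
cov = np.array([[2.0, 0.3, 0.1],
                [0.3, 1.0, 0.2],
                [0.1, 0.2, 0.5]])

# Marginal over channels 0 and 2 only.
sub_mean, sub_cov = marginalize_gaussian(mean, cov, [0, 2])
print(sub_mean)
print(sub_cov)
```

Because each GMM component is a normal distribution, the same selection can be applied per component while the mixture weights stay unchanged, which relates a density over a larger sensor set to one over a subset.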


2020 ◽  
Vol 9 (2) ◽  
pp. 109 ◽  
Author(s):  
Bo Cheng ◽  
Shiai Cui ◽  
Xiaoxiao Ma ◽  
Chenbin Liang

Feature extraction of urban areas is one of the most important directions of polarimetric synthetic aperture radar (PolSAR) applications. A high-resolution PolSAR image is high-dimensional and nonlinear. Therefore, to find intrinsic features for target recognition, a building area extraction method for PolSAR images based on the Adaptive Neighborhoods selection Neighborhood Preserving Embedding (ANSNPE) algorithm is proposed. First, 52 features are extracted by using the gray level co-occurrence matrix (GLCM) and five polarization decomposition methods. The feature set is divided into 20 dimensions, 36 dimensions, and 52 dimensions. Next, the ANSNPE algorithm is applied to the training samples, and the projection matrix is obtained for the test image to extract the new features. Lastly, the support vector machine (SVM) classifier and post-processing are used to extract the building area, and the accuracy is evaluated. Comparative experiments are conducted using Radarsat-2 data, and the results show that the ANSNPE algorithm can effectively extract the building area and has a better generalization ability: the projection matrix is obtained using the training data and can be applied directly to new samples, and the building area extraction accuracy is above 80%. The combination of polarization and texture features provides a wealth of information that is more conducive to the extraction of building areas.


2013 ◽  
Vol 37 (3) ◽  
pp. 467-476 ◽  
Author(s):  
Ing-Jr Ding ◽  
Chih-Ta Yen ◽  
Zih-Jheng Lin

In this paper, a fuzzy logic-based intelligent control (FLIC) scheme for support vector machine (SVM) speaker verification, called FLICSVM, is developed. The proposed FLICSVM method enhances SVM training by taking the properties of the training utterances into account when establishing the SVM model, and can therefore further improve the robustness of the SVM classifier in speaker verification. In FLICSVM, when the SVM model is established during training, the popular fuzzy control methodology is employed to tune a specific SVM parameter according to prior information about the SVM training utterances that is derived from Gaussian mixture model (GMM) calculations. Experimental results demonstrate that the proposed FLICSVM is evidently superior to the conventional SVM in recognition accuracy.


Author(s):  
DEBASHISH DEV MISHRA ◽  
UTPAL BHATTACHARJEE ◽  
SHIKHAR KUMAR SARMA

The performance of automatic speaker recognition (ASR) systems degrades drastically in the presence of noise and other distortions, especially when there is a noise level mismatch between the training and testing environments. This paper explores the problem of speaker recognition in noisy conditions, assuming that speech signals are corrupted by noise. A major problem of most speaker recognition systems is their unsatisfactory performance in noisy environments. In this experimental research, we have studied a combination of Mel Frequency Cepstral Coefficients (MFCC) for feature extraction and Cepstral Mean Normalization (CMN) techniques for speech enhancement. Our system uses a Gaussian Mixture Model (GMM) classifier and is implemented in the MATLAB® 7 programming environment. The process involves the use of speaker data for both training and testing. The test data is matched against a speaker model, which is trained on the training data using GMM modeling. Finally, experiments are carried out to test the new model for ASR given limited training data and with differing levels and types of realistic background noise. The results demonstrate the robustness of the new system.
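Cepstral Mean Normalization, as named in this abstract, is commonly implemented by subtracting the per-utterance mean of each cepstral coefficient. The sketch below is written in Python rather than the MATLAB environment the authors used, and it assumes the MFCC matrix has already been computed; the random array merely stands in for real features.

```python
import numpy as np

def cepstral_mean_normalize(features):
    """Subtract the per-coefficient mean over all frames of one utterance.

    features: (num_frames, num_coefficients) array of cepstral features.
    """
    return features - features.mean(axis=0, keepdims=True)

rng = np.random.default_rng(1)
mfcc = rng.normal(size=(50, 13))             # stand-in for 50 frames of 13 MFCCs
channel_offset = np.linspace(-2.0, 2.0, 13)  # hypothetical constant cepstral bias

clean_norm = cepstral_mean_normalize(mfcc)
shifted_norm = cepstral_mean_normalize(mfcc + channel_offset)
print(np.allclose(clean_norm, shifted_norm))
```

A stationary convolutional distortion appears as an approximately constant additive offset in the cepstral domain, so the two normalized matrices coincide here. Additive background noise is not removed by this step alone.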

