Bayesian distance metric learning and its application in automatic speaker recognition systems

This paper proposes a state-of-the-art automatic speaker recognition (ASR) system that uses Bayesian distance metric learning as a feature extractor. In this model, I explore constraints on the distance between modified and simplified i-vector pairs from the same speaker and from different speakers. The distance metric is approximated by a weighted covariance matrix built from the top eigenvectors of the covariance matrix, and this approximation is used to estimate the posterior distribution of the distance metric. Given a speaker label, I select the different-speaker data pairs with the highest cosine scores to form a set of speaker constraints. This set captures the most discriminative variability between speakers in the training data. The Bayesian distance metric learning approach achieves better performance than state-of-the-art methods. Furthermore, it is insensitive to normalization, in contrast to cosine scoring, and it is very effective when training data is limited. The modified supervised i-vector based ASR system is evaluated on the NIST SRE 2008 database. The best combined cosine score result, an EER of 1.767%, is obtained with LDA200 + NCA200 + LDA200, and the best Bayes_dml result, an EER of 1.775%, is obtained with LDA200 + NCA200 + LDA100. Bayes_dml is reported to outperform cosine scoring with combined normalization and to give the best reported result for the short2-short3 condition of NIST SRE 2008.
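The two scoring rules contrasted in this abstract can be illustrated with a minimal sketch. It assumes i-vectors are plain lists of floats, and it reads the "weighted covariance matrix from the top eigenvectors" as a metric of the form M = sum_k w_k v_k v_k^T. That reading, the function names, and the toy inputs are assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine_score(x, y):
    """Cosine similarity between two vectors (higher means more similar)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def low_rank_metric_sq_distance(x, y, eigvecs, weights):
    """Squared distance (x - y)^T M (x - y) with M = sum_k w_k v_k v_k^T.

    eigvecs: list of eigenvectors v_k (assumed given, unit length).
    weights: list of non-negative weights w_k, one per eigenvector.
    """
    diff = [a - b for a, b in zip(x, y)]
    total = 0.0
    for v, w in zip(eigvecs, weights):
        # Project the difference onto v_k, then weight the squared projection.
        proj = sum(d * c for d, c in zip(diff, v))
        total += w * proj * proj
    return total
```

With a single eigenvector along the first axis, only the difference along that axis contributes to the distance, which is the sense in which a low-rank metric keeps the most discriminative directions and ignores the rest.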


Operational Multi-Modal Distance Metric Learning to Image Reclamation

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.32.15725 ◽

2018 ◽

Vol 7 (2.32) ◽

pp. 405

Author(s):

L Lavanya ◽

Chebrolu Ujwala Pavani ◽

Gadchanda Vineeth ◽

Borada Lavanya

Keyword(s):

Distance Learning ◽

Metric Learning ◽

Distance Metric Learning ◽

Distance Metric ◽

Metric Properties ◽

Training Scheme ◽

Different Types ◽

Metric Distance ◽

The Cost ◽

Feature Attribute

Distance metric learning (DML) is an important technique for improving content-based image search. Although widely studied, most DML approaches adopt a single-modal training framework that learns a distance metric on a single feature type or on a combined feature space in which several feature types are simply concatenated. DML methods of this kind suffer from two critical limitations: (a) some feature types can significantly dominate others in the DML task because of their different attributes, and (b) learning a distance metric on the combined feature space can be time-consuming with the naive feature-combination approach. To address these limitations, this article studies an online multi-modal distance metric learning scheme (OMDML), which explores a two-level online learning scheme: (c) it learns to optimize the distance metric in each individual feature space separately, and (d) it learns to find the optimal combination of the different feature types. To further reduce the cost of DML in high-dimensional spaces, we offer a low-rank OMDML algorithm that not only reduces the computational cost but also maintains high accuracy. We carried out extensive experiments to evaluate the performance of the proposed algorithms for multi-modal image retrieval.
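The second level of such a scheme, combining per-feature-type distances, might be sketched as follows. The abstract does not state the update rule, so the multiplicative-weights update used here is an assumption, as are the function names and the discount factor `beta`.

```python
def combined_distance(distances, weights):
    """Weighted sum of the distances computed in each feature space."""
    return sum(w * d for w, d in zip(weights, distances))


def update_weights(weights, mistakes, beta=0.5):
    """Discount the weight of each feature type that ranked a pair wrongly.

    mistakes: one boolean per feature type, True if that type's metric
    made an error on the current training example.
    beta: discount factor in (0, 1); an illustrative choice.
    """
    scaled = [w * beta if wrong else w for w, wrong in zip(weights, mistakes)]
    total = sum(scaled)
    # Renormalize so the weights remain a convex combination.
    return [w / total for w in scaled]
```

For example, starting from equal weights over two feature types, one mistake by the first type with `beta=0.5` shifts the weights to one third and two thirds.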


Automatic Speaker Recognition System

10.21236/ada197980 ◽

1984 ◽

Author(s):

Alan Higgins ◽

Joe Naylor

Keyword(s):

Speaker Recognition ◽

Recognition System ◽

Automatic Speaker Recognition


The assessment of efficiency of the automatic speaker recognition system for voices registered using a throat microphone

XII Conference on Reconnaissance and Electronic Warfare Systems ◽

10.1117/12.2524591 ◽

2019 ◽

Author(s):

Kamil Kamiński ◽

Andrzej P. Dobrowolski ◽

Rafał Tatoń

Keyword(s):

Speaker Recognition ◽

Recognition System ◽

Automatic Speaker Recognition


Automatic Speaker Recognition System based on Optimised Machine Learning Algorithms

2019 IEEE AFRICON ◽

10.1109/africon46755.2019.9133823 ◽

2019 ◽

Author(s):

Tumisho Billson Mokgonyane ◽

Tshephisho Joseph Sefara ◽

Thipe Isaiah Modipa ◽

Madimetja Jonas Manamela

Keyword(s):

Machine Learning ◽

Speaker Recognition ◽

Learning Algorithms ◽

Recognition System ◽

Machine Learning Algorithms ◽

Automatic Speaker Recognition


Accelerometer based gesture recognition system using distance metric learning for nearest neighbour classification

2012 IEEE International Workshop on Machine Learning for Signal Processing ◽

10.1109/mlsp.2012.6349717 ◽

2012 ◽

Cited By ~ 5

Author(s):

Tea Marasovic ◽

Vladan Papic

Keyword(s):

Gesture Recognition ◽

Metric Learning ◽

Recognition System ◽

Distance Metric Learning ◽

Nearest Neighbour ◽

Distance Metric


A distance metric based outliers detection for robust Automatic Speaker Recognition applications

2011 Annual IEEE India Conference ◽

10.1109/indcon.2011.6139358 ◽

2011 ◽

Cited By ~ 1

Author(s):

Israj Ali ◽

Goutam Saha

Keyword(s):

Speaker Recognition ◽

Distance Metric ◽

Automatic Speaker Recognition ◽

Outliers Detection


Text Independent Automatic Speaker Recognition System using fusion of features

PRZEGLĄD ELEKTROTECHNICZNY ◽

10.15199/48.2015.10.52 ◽

2015 ◽

Vol 1 (10) ◽

pp. 249-253 ◽

Cited By ~ 1

Author(s):

Ewelina MAJDA-ZDANCEWICZ

Keyword(s):

Speaker Recognition ◽

Recognition System ◽

Automatic Speaker Recognition


MFCC AND CMN BASED SPEAKER RECOGNITION IN NOISY ENVIRONMENT

International Journal of Electronics Signals and Systems ◽

10.47893/ijess.2013.1137 ◽

2013 ◽

pp. 48-51

Author(s):

DEBASHISH DEV MISHRA ◽

UTPAL BHATTACHARJEE ◽

SHIKHAR KUMAR SARMA

Keyword(s):

Speaker Recognition ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Training Data ◽

Noisy Environment ◽

Noisy Environments ◽

Mel Frequency Cepstral Coefficients ◽

Automatic Speaker Recognition ◽

Cepstral Mean Normalization ◽

Testing Environments

The performance of an automatic speaker recognition (ASR) system degrades drastically in the presence of noise and other distortions, especially when there is a noise level mismatch between the training and testing environments. This paper explores the problem of speaker recognition in noisy conditions, assuming that speech signals are corrupted by noise. A major problem of most speaker recognition systems is their unsatisfactory performance in noisy environments. In this experimental research, we study a combination of Mel Frequency Cepstral Coefficients (MFCC) for feature extraction and Cepstral Mean Normalization (CMN) for speech enhancement. Our system uses a Gaussian Mixture Model (GMM) classifier and is implemented in the MATLAB® 7 programming environment. Speaker data is used for both training and testing: the test data is matched against a speaker model trained on the training data using GMM modeling. Finally, experiments are carried out to test the new model for ASR given limited training data and differing levels and types of realistic background noise. The results demonstrate the robustness of the new system.
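As commonly defined, CMN subtracts the per-coefficient mean, taken over all frames of an utterance, from every frame. A minimal sketch follows, assuming the MFCC features are already available as a list of frames; the function name and the toy values are illustrative and not from the paper, whose implementation is in MATLAB.

```python
def cepstral_mean_normalize(frames):
    """Subtract the utterance-level mean of each cepstral coefficient.

    frames: list of frames, each a list of cepstral coefficients of
    equal length. Returns a new list of mean-normalized frames.
    """
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each coefficient across all frames of the utterance.
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]
```

Because a stationary channel effect appears as an additive offset in the cepstral domain, removing the mean reduces the mismatch between training and testing conditions.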


THE AUTOMATIC SPEAKER RECOGNITION SYSTEM OF CRITICAL USE CLASSIFIER OPTIMIZATION

Radio Electronics Computer Science Control ◽

10.15588/1607-3274-2018-2-4 ◽

2018 ◽

Vol 0 (2) ◽

Cited By ~ 1

Author(s):

O. V. Bisikalo ◽

T. V. Grischuk ◽

V. V. Kovtun

Keyword(s):

Speaker Recognition ◽

Recognition System ◽

Automatic Speaker Recognition


Data-Adaptive Metric Learning with Scale Alignment

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013347 ◽

2019 ◽

Vol 33 ◽

pp. 3347-3354 ◽

Cited By ~ 1

Author(s):

Shuo Chen ◽

Chen Gong ◽

Jian Yang ◽

Ying Tai ◽

Le Hui ◽

...

Keyword(s):

Metric Learning ◽

Projection Matrix ◽

Training Data ◽

Data Pair ◽

Data Points ◽

Local Patterns ◽

Data Adaptive ◽

Thresholding Algorithm ◽

Projection Matrices ◽

Adaptive Metric

The central problem for most existing metric learning methods is to find a suitable projection matrix on the differences of all pairs of data points. However, a single unified projection matrix can hardly characterize all data similarities accurately as the practical data are usually very complicated, and simply adopting one global projection matrix might ignore important local patterns hidden in the dataset. To address this issue, this paper proposes a novel method dubbed “Data-Adaptive Metric Learning” (DAML), which constructs a data-adaptive projection matrix for each data pair by selectively combining a set of learned candidate matrices. As a result, every data pair can obtain a specific projection matrix, enabling the proposed DAML to flexibly fit the training data and produce discriminative projection results. The model of DAML is formulated as an optimization problem which jointly learns candidate projection matrices and their sparse combination for every data pair. Nevertheless, over-fitting may occur due to the large number of parameters to be learned. To tackle this issue, we adopt the Total Variation (TV) regularizer to align the scales of data embedding produced by all candidate projection matrices, and thus the generated metrics of these learned candidates are generally comparable. Furthermore, we extend the basic linear DAML model to the kernelized version (denoted “KDAML”) to handle the non-linear cases, and the Iterative Shrinkage-Thresholding Algorithm (ISTA) is employed to solve the optimization model. Intensive experimental results on various applications including retrieval, classification, and verification clearly demonstrate the superiority of our algorithm to other state-of-the-art metric learning methodologies.
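The shrinkage step that gives ISTA its name can be sketched in a few lines. This is the generic soft-thresholding operator and a single generic ISTA update for an L1-regularized objective, not the paper's DAML objective; the function names, step size, and regularization weight are illustrative assumptions.

```python
import math


def soft_threshold(values, threshold):
    """Shrink each entry toward zero by `threshold`, clipping at zero."""
    return [math.copysign(max(abs(v) - threshold, 0.0), v) for v in values]


def ista_step(w, grad, step, lam):
    """One ISTA update: a gradient step on the smooth part of the
    objective, followed by soft-thresholding for the L1 penalty.

    w: current parameter vector; grad: gradient of the smooth loss at w;
    step: step size; lam: weight of the L1 penalty.
    """
    moved = [wi - step * gi for wi, gi in zip(w, grad)]
    return soft_threshold(moved, step * lam)
```

Entries whose magnitude falls below the threshold are set exactly to zero, which is how a sparse combination of candidate matrices can emerge from the iterations.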
