GuidedMix: An on‐the‐fly data augmentation approach for robust speaker recognition system

Abstract: Large-scale training data is known to improve recognition performance. However, collecting a large amount of labeled training data for speaker recognition is difficult. At the same time, the performance of speaker recognition is strongly affected by environmental noise. In this paper, we apply data augmentation by adding noise, both to enlarge the training set and to improve the robustness of speaker recognition. The experimental results demonstrate that data augmentation improves performance on the Chinese-863 database.
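The abstract does not say how the noise is added, so the following is only a minimal sketch of one common form of additive-noise augmentation: mixing a noise recording into a clean utterance at a chosen signal-to-noise ratio (SNR). The function names `add_noise_at_snr` and `augment`, and the SNR levels, are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def add_noise_at_snr(speech, noise, snr_db):
    """Mix `noise` into `speech` so the mixture has the requested SNR in dB.

    Hypothetical helper: the source abstract only says noise is added.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat or truncate the noise so it covers the whole utterance.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)

    # Choose the scale so that
    # speech_power / (scale**2 * noise_power) == 10 ** (snr_db / 10).
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise


def augment(speech, noise, snr_levels_db=(20, 10, 5)):
    """Return one noisy copy of `speech` per SNR level (levels are assumed)."""
    return [add_noise_at_snr(speech, noise, snr) for snr in snr_levels_db]
```

Each clean utterance then yields several noisy copies that keep the same speaker label, which is how such a scheme would enlarge the training set without new recordings.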

Multitaper Based MFCC Feature Extraction for Robust Speaker Recognition System

2019 Innovations in Power and Advanced Computing Technologies (i-PACT) ◽ 10.1109/i-pact44901.2019.8960206 ◽ 2019 ◽ Cited By ~ 2

Author(s): K.P. Bharath ◽ Rajesh Kumar M.

Keyword(s): Feature Extraction ◽ Speaker Recognition ◽ Recognition System ◽ Robust Speaker Recognition

Cost-Sensitive Learning for Emotion Robust Speaker Recognition

The Scientific World JOURNAL ◽ 10.1155/2014/628516 ◽ 2014 ◽ Vol 2014 ◽ pp. 1-9 ◽ Cited By ~ 6

Author(s): Dongdong Li ◽ Yingchun Yang ◽ Weihui Dai

Keyword(s): Speaker Recognition ◽ Recognition System ◽ Learning Technology ◽ Speech Corpus ◽ Voice Communication ◽ Cost Sensitive Learning ◽ Identification Rate ◽ Telephone System ◽ Robust Speaker Recognition ◽ Voice Data

In the field of information security, voice is one of the most important biometric traits. With the growth of voice communication over the Internet and telephone systems, huge voice data resources have become accessible. In speaker recognition, a voiceprint can serve as a unique password with which users prove their identity. However, speech with various emotions can cause an unacceptably high error rate and degrade the performance of a speaker recognition system. This paper addresses the problem by introducing a cost-sensitive learning technology that reweights the probability of test affective utterances at the pitch envelope level, which effectively enhances robustness in emotion-dependent speaker recognition. Based on that technology, a new recognition system architecture and its components are proposed. An experiment conducted on the Mandarin Affective Speech Corpus shows an improvement of 8% in identification rate over traditional speaker recognition.

A Deep Learning based Arabic Script Recognition System: Benchmark on KHAT

The International Arab Journal of Information Technology ◽ 10.34028/iajit/17/3/3 ◽ 2020 ◽ Vol 17 (3) ◽ pp. 299-305 ◽ Cited By ~ 1

Author(s): Riaz Ahmad ◽ Saeeda Naz ◽ Muhammad Afzal ◽ Sheikh Rashid ◽ Marcus Liwicki ◽ ...

Keyword(s): Deep Learning ◽ Character Recognition ◽ Data Augmentation ◽ Short Term Memory ◽ Recognition System ◽ Learning Approach ◽ Arabic Text ◽ Data Set ◽ Processing Step ◽ Handwritten Arabic

This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text-lines. The paper contributes in three main aspects: (1) pre-processing, (2) a deep learning based approach, and (3) data augmentation. The pre-processing step prunes extra white space and de-skews the skewed text-lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text-lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflammation. Combining data augmentation with the deep learning approach improves the results, reaching 80.02% Character Recognition (CR) against a baseline of 75.08%.
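The abstract gives no detail on how the extra white space is pruned, so the following is only a rough sketch of one plausible approach: cropping a grayscale text-line image to the bounding box of its ink pixels. The function name `prune_white_margins` and the threshold value are assumptions for illustration, and de-skewing is not covered.

```python
import numpy as np


def prune_white_margins(image, ink_threshold=128):
    """Crop a grayscale text-line image to the bounding box of its ink.

    Hypothetical helper: pixels darker than `ink_threshold` count as ink,
    which assumes dark writing on a light background.
    """
    image = np.asarray(image)
    ink = image < ink_threshold

    # Indices of rows and columns that contain at least one ink pixel.
    rows = np.flatnonzero(ink.any(axis=1))
    cols = np.flatnonzero(ink.any(axis=0))

    if rows.size == 0:
        # Blank image: there is nothing to crop to, so return it unchanged.
        return image

    return image[rows[0]: rows[-1] + 1, cols[0]: cols[-1] + 1]
```

A fixed threshold is the simplest choice here; scanned pages with uneven lighting would likely need an adaptive threshold instead.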

Emotional Speech Clustering Based Robust Speaker Recognition System

Robust speaker recognition system employing covariance matrix and Eigenvoice

A robust speaker recognition based on Data augmentation

Segmental Analysis of Speech Signal for Robust Speaker Recognition System

Binaural Classification-Based Speech Segregation and Robust Speaker Recognition System

A robust speaker recognition system combining factor analysis techniques
