Advanced voice activity detection on mobile phones by using microphone array and phoneme-specific Gaussian mixture models

Abstract: The performance of most state-of-the-art speaker recognition (SR) systems deteriorates under degraded conditions, owing to mismatch between the training and testing sessions. This study focuses on the front end of the speaker verification (SV) system to reduce that mismatch. An adaptive voice activity detection (VAD) algorithm using a zero-frequency filter assisted peaking resonator (ZFFPR) was integrated into the front end of the SV system. The performance of the proposed SV system was studied under degraded conditions with 50 selected speakers from the NIST 2003 database. The degraded conditions were simulated by adding different types of noise from the NOISEX-92 database to the original speech utterances at signal-to-noise ratios from 0 to 20 dB. The features were the widely used 39-dimensional Mel frequency cepstral coefficients (MFCC; i.e., 13 static MFCCs augmented with 13 velocity and 13 acceleration coefficients), and a Gaussian mixture model–universal background model was used for speaker modeling. The proposed system's performance was compared against an SV system with an energy-based VAD as its front end. The proposed SV system showed some encouraging results when EMD-based VAD was used at its front end.
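The abstract describes simulating degraded conditions by adding noise at signal-to-noise ratios from 0 to 20 dB. A minimal sketch of that mixing step follows, assuming speech and noise are plain lists of samples at the same sampling rate. The function names `mix_at_snr` and `measured_snr_db` are hypothetical and do not come from the paper, and the synthetic sine and Gaussian noise only stand in for NIST 2003 speech and NOISEX-92 recordings.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` after scaling it to the requested SNR in dB.

    Hypothetical helper: the paper does not specify its mixing procedure.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = sum(x * x for x in speech) / len(speech)
    p_noise = sum(x * x for x in noise) / len(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must have non-zero power")
    # SNR_dB = 10 * log10(p_speech / (gain^2 * p_noise)), solved for gain.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """Recover the SNR of a mixture given the clean reference signal."""
    p_speech = sum(x * x for x in speech)
    p_resid = sum((y - x) ** 2 for x, y in zip(speech, noisy))
    return 10 * math.log10(p_speech / p_resid)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins for a speech utterance and a noise recording.
    speech = [math.sin(2 * math.pi * 0.01 * n) for n in range(2000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(2500)]
    for snr in (0, 5, 10, 15, 20):
        noisy = mix_at_snr(speech, noise, snr)
        print(snr, round(measured_snr_db(speech, noisy), 3))
```

Scaling the noise rather than the speech keeps the speech level constant across conditions, which is one common choice; the study may have used a different convention.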

Spatial voice activity detection using a microphone array noise canceller

The Journal of the Acoustical Society of America ◽

10.1121/1.425151 ◽

1999 ◽

Vol 105 (2) ◽

pp. 1100-1100

Author(s):

Michael W. Hoffman ◽

Zhao Li

Keyword(s):

Microphone Array ◽

Voice Activity Detection ◽

Activity Detection ◽

Voice Activity

Fault Detection in a Microphone Array by Intercorrelation of Features in Voice Activity Detection

IEEE Transactions on Industrial Electronics ◽

10.1109/tie.2010.2062481 ◽

2011 ◽

Vol 58 (6) ◽

pp. 2568-2571 ◽

Cited By ~ 8

Author(s):

Jinsung Kim ◽

Bum-Jae You

Keyword(s):

Fault Detection ◽

Microphone Array ◽

Voice Activity Detection ◽

Activity Detection ◽

Voice Activity

Multi-speaker voice activity detection using a camera-assisted microphone array

2016 International Conference on Systems, Signals and Image Processing (IWSSIP) ◽

10.1109/iwssip.2016.7502768 ◽

2016 ◽

Cited By ~ 3

Author(s):

Trond F. Bergh ◽

Ines Hafizovic ◽

Sverre Holm

Keyword(s):

Microphone Array ◽

Voice Activity Detection ◽

Activity Detection ◽

Voice Activity

Voice activity detection using phase vector in microphone array

Electronics Letters ◽

10.1049/el:20070780 ◽

2007 ◽

Vol 43 (14) ◽

pp. 783 ◽

Cited By ~ 11

Author(s):

G. Kim ◽

N.I. Cho

Keyword(s):

Microphone Array ◽

Voice Activity Detection ◽

Phase Vector ◽

Activity Detection ◽

Voice Activity

Energy-based multi-speaker voice activity detection with an ad hoc microphone array

2010 IEEE International Conference on Acoustics, Speech and Signal Processing ◽

10.1109/icassp.2010.5496183 ◽

2010 ◽

Cited By ~ 22

Author(s):

Alexander Bertrand ◽

Marc Moonen

Keyword(s):

Ad Hoc ◽

Microphone Array ◽

Voice Activity Detection ◽

Activity Detection ◽

Voice Activity

Voice activity detection using the phase vector in microphone array

10.21437/interspeech.2007-737 ◽

2007 ◽

Author(s):

Gibak Kim ◽

Nam Ik Cho

Keyword(s):

Microphone Array ◽

Voice Activity Detection ◽

Phase Vector ◽

Activity Detection ◽

Voice Activity

Expanded VAD Guided Subdivision of Cardiopulmonary Sounds

Revista Ingeniería Biomédica ◽

10.24050/19099762.n25.2019.1317 ◽

2019 ◽

Vol 13 (25) ◽

Author(s):

Julio Alejandro Valdez Gonzalez ◽

Pedro Mayorga Ortiz ◽

Christopher Druzgalski ◽

Vesna Zeljkovic ◽

Gilberto Chavez ◽

...

Keyword(s):

Statistical Approach ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Medical Professional ◽

Heart Sounds ◽

Activity Detection ◽

Lung Sound ◽

Mel Frequency Cepstral Coefficients ◽

The Voice ◽

Voice Activity

Cardiopulmonary auscultation is a challenging diagnostic procedure because the components of heart and lung sounds overlap. Many approaches have been used to quantify the characteristics of these signals; one of the newest combines voice activity detection (VAD) with Gaussian mixture models (GMM). Treating lung and heart sounds as acoustic events, this paper proposes a novel methodology for assessing these diagnostic indicators. VAD-GMM was applied to detect and extract the main events in lung and heart sounds. Its results were compared with those of another VAD methodology based on a statistical approach, and VAD-GMM was found to give more definite results. Since Mel frequency cepstral coefficient (MFCC) and quartile feature vectors had already proved successful in pattern recognition, VAD-GMM was carried out using these acoustic vectors. This method could therefore aid the transition from traditional qualitative auscultation to quantitative assessment and computer-assisted diagnosis by identifying abnormal acoustic indicators. Diagnosis by computerized detection promises to be more efficient than traditional methods, which are limited by the auditory capability and experience of the medical professional.
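The idea of using a GMM to separate acoustic events from background can be sketched in a few lines. The example below is only an illustration under stated assumptions: it fits a two-component, one-dimensional mixture to frame log-energy with EM, whereas the paper uses MFCC and quartile feature vectors, and all function names (`gaussian_pdf`, `fit_gmm_1d`, `frame_log_energy`, `detect_events`) are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(values, iters=50):
    """Fit a two-component 1-D Gaussian mixture with EM."""
    lo, hi = min(values), max(values)
    means = [lo, hi]  # initialise the components at the extremes
    var0 = max((hi - lo) ** 2 / 4, 1e-6)
    variances = [var0, var0]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            p = [weights[k] * gaussian_pdf(v, means[k], variances[k])
                 for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and (floored) variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2
                      for r, v in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)
            weights[k] = nk / len(values)
    return weights, means, variances


def frame_log_energy(signal, frame_len):
    """Log mean-square energy of consecutive non-overlapping frames."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        feats.append(math.log(energy + 1e-12))
    return feats


def detect_events(signal, frame_len=100):
    """Label each frame True when the higher-energy component is more likely."""
    feats = frame_log_energy(signal, frame_len)
    weights, means, variances = fit_gmm_1d(feats)
    active = means.index(max(means))
    labels = []
    for v in feats:
        p = [weights[k] * gaussian_pdf(v, means[k], variances[k])
             for k in range(2)]
        labels.append(p[active] > p[1 - active])
    return labels


if __name__ == "__main__":
    rng = random.Random(1)
    quiet = lambda n: [rng.gauss(0.0, 0.01) for _ in range(n)]
    loud = lambda n: [rng.gauss(0.0, 1.0) for _ in range(n)]
    signal = quiet(1000) + loud(1000) + quiet(1000)
    print(detect_events(signal))
```

A real system would replace the log-energy feature with multi-dimensional vectors and a library GMM; the one-dimensional case is used here only because it keeps the EM steps easy to follow.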

Voice activity detection method based on inter-frame correlation

Journal of Computer Applications ◽

10.3724/sp.j.1087.2011.01447 ◽

2011 ◽

Vol 31 (5) ◽

pp. 1447-1449

Author(s):

Yu LI ◽

Lei-yong GUO ◽

Hong-zhou TAN

Keyword(s):

Detection Method ◽

Voice Activity Detection ◽

Activity Detection ◽

Voice Activity ◽

Inter Frame
