AN EFFICIENT SPEECH GENERATIVE MODEL BASED ON DETERMINISTIC/STOCHASTIC SEPARATION OF SPECTRAL ENVELOPES

The paper presents a speech generative model that provides an efficient way of generating speech waveform from its amplitude spectral envelopes. The model is based on hybrid speech representation that includes deterministic (harmonic) and stochastic (noise) components. The main idea behind the approach originates from the fact that speech signal has a determined spectral structure that is statistically bound with deterministic/stochastic energy distribution in the spectrum. The performance of the model is evaluated using an experimental low-bitrate wide-band speech coder. The quality of reconstructed speech is evaluated using objective and subjective methods. Two objective quality characteristics were calculated: Modified Bark Spectral Distortion (MBSD) and Perceptual Evaluation of Speech Quality (PESQ). Narrow-band and wide-band versions of the proposed solution were compared with MELP (Mixed Excitation Linear Prediction) speech coder and AMR (Adaptive Multi-Rate) speech coder, respectively. The speech base of two female and two male speakers were used for testing. The performed tests show that overall performance of the proposed approach is speaker-dependent and it is better for male voices. Supposedly, this difference indicates the influence of pitch highness on separation accuracy. In that way, using the proposed approach in experimental speech compression system provides decent MBSD values and comparable PESQ values with AMR speech coder at 6,6 kbit/s. Additional subjective listening testsdemonstrate that the implemented coding system retains phonetic content and speaker’s identity. It proves consistency of the proposed approach.

Download Full-text

An intelligibility enhancement for the mixed excitation linear prediction speech coder

IEEE Signal Processing Letters ◽

10.1109/lsp.2003.815617 ◽

2003 ◽

Vol 10 (9) ◽

pp. 263-266 ◽

Cited By ~ 6

Author(s):

N.R. Chong-White ◽

R.V. Cox

Keyword(s):

Linear Prediction ◽

Mixed Excitation Linear Prediction ◽

Speech Coder

Download Full-text

Implementation and Design of Underwater Duplex Speech Communication System Based on Vector Hydrophone

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.336-338.1939 ◽

2013 ◽

Vol 336-338 ◽

pp. 1939-1944

Author(s):

Hua Chao Cui ◽

An Bang Zhao ◽

Bin Zhou ◽

Kun Ping Sun ◽

Yan Cui

Keyword(s):

Linear Prediction ◽

Error Correcting Code ◽

Local Source ◽

Signal Acquisition ◽

Speech Communication ◽

Speech Compression ◽

Vector Hydrophone ◽

Tank Test ◽

Stable Performance ◽

Mixed Excitation Linear Prediction

This paper realizes 1.2kbps Mixed Excitation Linear Prediction (MELP) speech compression algorithm through multi-frame joint and Vector Quantization technology. In order to adapt to underwater acoustic channel, we apply the Orthogonal Frequency Division Modulation (OFDM) to modulate the source bits, together with synchronization, channel estimation and RS error correcting code technology. The hardware platform of this system relies on TMS320DM642 and AIC23. Moreover, we introduce single vector hydrophone to restrain the local source. The system finished duplex speech signal acquisition and transmission and tank test results prove that the synthesized speech is clear and understandable and it has a stable performance.

Download Full-text