Speaker-independent HMM-based voice conversion using quantized fundamental frequency

Speaker-independent HMM-based voice conversion using adaptive quantization of the fundamental frequency

Speech Communication ◽

10.1016/j.specom.2011.05.001 ◽

2011 ◽

Vol 53 (7) ◽

pp. 973-985 ◽

Cited By ~ 8

Author(s):

Takashi Nose ◽

Takao Kobayashi

Keyword(s):

Fundamental Frequency ◽

Voice Conversion ◽

Adaptive Quantization ◽

Speaker Independent

Increasing the Intelligibility and Naturalness of Alaryngeal Speech Using Voice Conversion and Synthetic Fundamental Frequency

10.21437/interspeech.2020-1196 ◽

2020 ◽

Author(s):

Tuan Dinh ◽

Alexander Kain ◽

Robin Samlan ◽

Beiming Cao ◽

Jun Wang

Keyword(s):

Fundamental Frequency ◽

Voice Conversion ◽

Alaryngeal Speech

Method for analyzing fundamental frequency information and voice conversion method and system implementing said analysis method

The Journal of the Acoustical Society of America ◽

10.1121/1.3455408 ◽

2010 ◽

Vol 127 (6) ◽

pp. 3878

Author(s):

Taoufik En-Najjary ◽

Olivier Rosec

Keyword(s):

Fundamental Frequency ◽

Voice Conversion ◽

Analysis Method ◽

Frequency Information ◽

Conversion Method

Voice conversion in time-invariant speaker-independent space

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2014.6855136 ◽

2014 ◽

Cited By ~ 5

Author(s):

Toru Nakashika ◽

Tetsuya Takiguchi ◽

Yasuo Ariki

Keyword(s):

Voice Conversion ◽

Speaker Independent ◽

Time Invariant

Fundamental frequency modeling using wavelets for emotional voice conversion

2015 International Conference on Affective Computing and Intelligent Interaction (ACII) ◽

10.1109/acii.2015.7344665 ◽

2015 ◽

Cited By ~ 15

Author(s):

Huaiping Ming ◽

Dongyan Huang ◽

Minghui Dong ◽

Haizhou Li ◽

Lei Xie ◽

...

Keyword(s):

Fundamental Frequency ◽

Voice Conversion

Voice Conversion Using Pitch Shifting Algorithm by Time Stretching with PSOLA and Re-Sampling

Journal of Electrical Engineering ◽

10.2478/v10187-010-0008-5 ◽

2010 ◽

Vol 61 (1) ◽

pp. 57-61 ◽

Cited By ~ 15

Author(s):

Allam Mousa

Keyword(s):

Fundamental Frequency ◽

Speech Signal ◽

Voice Conversion ◽

Inverse Filter ◽

Pitch Period ◽

Male And Female ◽

Single Frame ◽

Female Speech ◽

Time Stretching

Voice changing has many industrial and commercial applications. This paper focuses on voice conversion using a pitch-shifting method that detects the pitch (fundamental frequency) of the signal with Simplified Inverse Filter Tracking (SIFT), changes it to match the target pitch period by time stretching with the Pitch Synchronous Overlap-Add (PSOLA) algorithm, and then resamples the signal so that it plays back at the same rate. The same study was performed to see the effect of voice conversion on Arabic speech signals. Treatment of certain Arabic voiced vowels and conversion between male and female speech showed some expansion or compression in the resulting speech. A comparison in terms of pitch shifting is presented. The analysis was performed for a single frame and for a full segmentation of speech.
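
The resampling step in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `pitch_ratio` and `resample_linear` are hypothetical, linear interpolation is an assumed stand-in for whatever resampler the authors used, and the PSOLA time-stretching stage is not implemented here. The idea shown is only the arithmetic: if a signal is first time-stretched by the ratio target F0 / source F0 without changing its pitch, then reading it back that many times faster restores the original duration and scales the pitch by the same ratio.

```python
import math


def pitch_ratio(source_f0_hz, target_f0_hz):
    """Factor by which the pitch must be scaled (hypothetical helper)."""
    if source_f0_hz <= 0 or target_f0_hz <= 0:
        raise ValueError("F0 values must be positive")
    return target_f0_hz / source_f0_hz


def resample_linear(signal, factor):
    """Read `signal` with a step of `factor` samples, using linear interpolation.

    factor > 1 shortens the signal and raises its pitch at a fixed play rate;
    factor < 1 lengthens it and lowers its pitch.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not signal:
        return []
    last = len(signal) - 1
    n_out = int(last / factor) + 1
    out = []
    for i in range(n_out):
        pos = i * factor
        lo = min(int(pos), last)      # sample just before the read position
        hi = min(lo + 1, last)        # sample just after it
        frac = pos - lo
        out.append((1.0 - frac) * signal[lo] + frac * signal[hi])
    return out
```

In a full pipeline the input to `resample_linear` would be the PSOLA-stretched signal; for a quick check, a longer sinusoid at the source pitch can play that role, since time stretching leaves the pitch unchanged.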

Method for analyzing fundamental frequency information and voice conversion method and system implementing said analysis method

The Journal of the Acoustical Society of America ◽

10.1121/1.3457088 ◽

2010 ◽

Vol 127 (6) ◽

pp. 3869

Author(s):

Taoufik En-Najjary ◽

Olivier Rosec

Keyword(s):

Fundamental Frequency ◽

Voice Conversion ◽

Analysis Method ◽

Frequency Information ◽

Conversion Method

Speaker‐independent vowel classification based on fundamental frequency and formant frequencies

The Journal of the Acoustical Society of America ◽

10.1121/1.2024478 ◽

1987 ◽

Vol 81 (S1) ◽

pp. S93-S93

Author(s):

James Hillenbrand ◽

Robert T. Gayvert

Keyword(s):

Fundamental Frequency ◽

Formant Frequencies ◽

Speaker Independent

Complex Cepstrum Based Voice Conversion Using Radial Basis Function

ISRN Signal Processing ◽

10.1155/2014/357048 ◽

2014 ◽

Vol 2014 ◽

pp. 1-13 ◽

Cited By ~ 10

Author(s):

Jagannath Nirmal ◽

Suprava Patnaik ◽

Mukesh Zaveri ◽

Pramod Kachare

Keyword(s):

Radial Basis Function ◽

Fundamental Frequency ◽

Basis Function ◽

Speech Signal ◽

Vocal Tract ◽

Voice Conversion ◽

Radial Basis ◽

Complex Cepstrum ◽

Source Excitation ◽

Mel Cepstrum

The complex cepstrum vocoder is used to modify the speaker-specific characteristics of the source speaker's speech to those of the target speaker's speech. Low-time and high-time liftering are used to split the calculated cepstrum into the vocal tract and source excitation parameters. The resulting mixed-phase vocal tract and source excitation parameters with finite impulse response preserve the phase properties of the resynthesized speech frame. A radial basis function is explored to capture the nonlinear mapping function for modifying the complex-cepstrum-based real and imaginary components of the vocal tract and source excitation of the speech signal. The state-of-the-art Mel cepstrum envelope and the fundamental frequency (F0) are considered to represent the vocal tract and the source excitation of the speech frame, respectively. A radial basis function is used to capture and formulate the nonlinear relations between the Mel cepstrum envelopes of the source and target speakers. A mean and standard deviation approach is employed to modify the fundamental frequency (F0). The Mel log spectral approximation filter is used to reconstruct the speech signal from the modified Mel cepstrum envelope and fundamental frequency. The proposed complex-cepstrum-based model is compared with the state-of-the-art Mel cepstrum envelope based voice conversion model using objective and subjective evaluations. The evaluation measures reveal that the proposed complex-cepstrum-based voice conversion system approximates the converted speech signal with better accuracy than the model based on the Mel cepstrum envelope.
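
The mean and standard deviation approach to F0 modification mentioned in the abstract above can be sketched as follows. This is an illustrative Python sketch, not the authors' code: the function names `f0_stats` and `convert_f0` are hypothetical, and two conventions are assumed that the abstract does not state, namely that the statistics are computed on log-F0 and that unvoiced frames are marked with F0 = 0 and passed through unchanged.

```python
import math
import statistics


def f0_stats(f0_track):
    """Mean and standard deviation of log-F0 over voiced frames (F0 > 0)."""
    log_f0 = [math.log(f) for f in f0_track if f > 0]
    if not log_f0:
        raise ValueError("no voiced frames in track")
    return statistics.mean(log_f0), statistics.pstdev(log_f0)


def convert_f0(f0_track, src_stats, tgt_stats):
    """Map source F0 values so their log-F0 mean and spread match the target's."""
    src_mean, src_std = src_stats
    tgt_mean, tgt_std = tgt_stats
    if src_std <= 0:
        raise ValueError("source log-F0 standard deviation must be positive")
    converted = []
    for f in f0_track:
        if f <= 0:
            converted.append(0.0)  # assumed convention: unvoiced frame stays unvoiced
        else:
            z = (math.log(f) - src_mean) / src_std   # normalize with source statistics
            converted.append(math.exp(tgt_mean + tgt_std * z))
    return converted
```

The statistics would normally be estimated once from training utterances of each speaker and then applied frame by frame at conversion time; working in the log domain keeps the converted F0 positive.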

GMM FOR EMOTION RECOGNITION OF VIETNAMESE

Journal of Computer Science and Cybernetics ◽

10.15625/1813-9663/33/3/11017 ◽

2018 ◽

Vol 33 (3) ◽

pp. 229-246

Author(s):

Đào Thị Lệ Thủy ◽

Trinh Van Loan ◽

Nguyen Hong Quang

Keyword(s):

Emotion Recognition ◽

Fundamental Frequency ◽

Spectral Characteristics ◽

Average Score ◽

Speech Signals ◽

Characteristic Parameters ◽

Basic Emotions ◽

Second Derivatives ◽

Speaker Independent

This paper presents the results of GMM-based recognition of four basic emotions in Vietnamese speech: neutral, sadness, anger and happiness. The characteristic parameters of these emotions are extracted from speech signals and divided into different parameter sets for experiments. The experiments cover speaker-dependent and speaker-independent recognition as well as content-dependent and content-independent recognition. The results show that recognition scores are rather high when the full combination of parameters is used: MFCCs and their first and second derivatives, fundamental frequency, energy, formants and their corresponding bandwidths, spectral characteristics and F0 variants. On average, the speaker-dependent and content-dependent recognition score is 89.21%. The average score is 82.27% for speaker-dependent and content-independent recognition, 70.35% for speaker-independent and content-dependent recognition, and 66.99% for speaker-independent and content-independent recognition. Information on F0 significantly increased the recognition score.
