A Study on the Robustness of Pitch Range Estimation from Brief Speech Segments

Author(s): Wenjie Peng, Kaiqi Fu, Wei Zhang, Yanlu Xie, Jinsong Zhang
2020, Vol 30 (01), pp. 2050003

Pitch-range estimation from brief speech segments could benefit many tasks, such as automatic speech recognition and speaker recognition. To estimate pitch range, previous studies have proposed deep-learning-based models that take spectrum information as input. They demonstrated that this approach works and can still achieve reliable estimates when the speech segment is as brief as 300 ms. In this study, we evaluate the robustness of this method under the following scenarios: (1) a large number of training speakers; (2) different language backgrounds; and (3) monosyllabic utterances with different tones. Experimental results show that: (1) using a large number of training speakers improves estimation accuracy; (2) the mean absolute percentage error (MAPE) on L2 speakers is similar to that on native speakers; and (3) different tonal information affects the LSTM-based model, but this influence is limited compared with the baseline method, which calculates pitch-range targets from the distribution of F0 values. These experimental results verify the efficiency of the LSTM-based pitch-range estimation method.
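
As a rough illustration of the two quantities named in the abstract, the following minimal sketch computes MAPE between reference and estimated pitch-range values, and a distribution-based pitch range from a list of frame-level F0 values. The abstract does not say how the baseline derives its targets from the F0 distribution, so the percentile bounds, the treatment of zero as "unvoiced", and the function names `mape` and `f0_distribution_range` are all assumptions made for this example, not the authors' method.

```python
def mape(reference, estimate):
    """Mean absolute percentage error, in percent.

    Standard definition: mean(|ref - est| / |ref|) * 100.
    Reference values must be non-zero.
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for ref, est in zip(reference, estimate):
        total += abs(ref - est) / abs(ref)
    return 100.0 * total / len(reference)


def _percentile(sorted_values, pct):
    """Linear-interpolation percentile of an already sorted list."""
    pos = (len(sorted_values) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac


def f0_distribution_range(f0_hz, low_pct=5.0, high_pct=95.0):
    """Pitch range (bottom, top) in Hz from the distribution of F0 values.

    Assumptions for illustration only: unvoiced frames are coded as 0,
    and the range bounds are the low_pct / high_pct percentiles.
    """
    voiced = sorted(v for v in f0_hz if v > 0)
    if not voiced:
        raise ValueError("no voiced frames in input")
    return _percentile(voiced, low_pct), _percentile(voiced, high_pct)
```

For example, `mape([100.0, 200.0], [110.0, 180.0])` averages two 10% errors, and `f0_distribution_range(track, 0, 100)` reduces to the minimum and maximum voiced F0 in `track`. A distribution-based estimate of this kind depends on how many voiced frames the segment contains, which is one reason brief segments are a hard case for it.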