Effect of High-Energy Voiced Speech Segments and Speaker Gender on Shouted Speech Detection

Author(s):  
Shikha Baghel ◽  
S. R. M. Prasanna ◽  
Prithwijit Guha
2015 ◽  
Vol 39 (2) ◽  
pp. 231-242 ◽  
Author(s):  
Marek Blok ◽  
Piotr Drózda

Abstract: This paper presents a sample rate conversion algorithm that allows the resampling ratio to change continuously. The proposed implementation is based on a variable fractional delay filter implemented by means of a Farrow structure. The coefficients of this structure are computed from fractional delay filters designed using the offset window method. The proposed approach allows the instantaneous resampling ratio to be changed freely during processing. Such an algorithm can simulate the recording of audio on magnetic tape moving at nonuniform velocity, and it can also remove such distortions. The capabilities of the approach are demonstrated on speech signal processing, with a resampling ratio computed from the estimated fundamental frequency of voiced speech segments.
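The idea of a Farrow-structure resampler with a time-varying ratio can be illustrated with a minimal sketch. The abstract says the Farrow coefficients come from fractional delay filters designed with the offset window method; the code below does not reproduce that design and substitutes the well-known cubic Lagrange interpolator, which is also commonly written in Farrow form. The function names `farrow_cubic` and `resample_variable`, and the `ratio_at` callback, are hypothetical and chosen only for illustration.

```python
def farrow_cubic(xm1, x0, x1, x2, mu):
    """Cubic Lagrange interpolator in Farrow form (an assumption, not the
    paper's offset-window design). mu in [0, 1) is the fractional position
    between x0 and x1."""
    # Polynomial-in-mu coefficients computed from the four neighbouring samples.
    c0 = x0
    c1 = -xm1 / 3.0 - x0 / 2.0 + x1 - x2 / 6.0
    c2 = xm1 / 2.0 - x0 + x1 / 2.0
    c3 = -xm1 / 6.0 + x0 / 2.0 - x1 / 2.0 + x2 / 6.0
    # Horner evaluation: this is what makes the delay cheaply tunable per sample.
    return ((c3 * mu + c2) * mu + c1) * mu + c0


def resample_variable(x, ratio_at):
    """Resample x with an instantaneous output/input rate ratio.

    ratio_at(t) is a hypothetical callback returning the (positive) ratio
    at input time t, measured in input samples.
    """
    out = []
    t = 1.0  # start where a left neighbour x[n-1] exists
    while t < len(x) - 2:  # stop while x[n+2] is still available
        n = int(t)
        mu = t - n
        out.append(farrow_cubic(x[n - 1], x[n], x[n + 1], x[n + 2], mu))
        t += 1.0 / ratio_at(t)  # the ratio may differ at every output sample
    return out


if __name__ == "__main__":
    ramp = [float(n) for n in range(50)]
    # Upsample by 2 for the first half of the signal, by 1.25 afterwards.
    y = resample_variable(ramp, lambda t: 2.0 if t < 25 else 1.25)
    print(len(y), y[:4])
```

Because only `mu` changes between output samples, the ratio can vary freely without redesigning any filter, which is the property the abstract relies on.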


2006 ◽  
Vol 65 (7) ◽  
pp. 655-665
Author(s):  
A. Mantilla-Caeiros ◽  
Hector Manuel Perez-Meana ◽  
D. Mata-Verde ◽  
C. Angeles-Pina ◽  
J. Alvarado-Soriano ◽  
...  

Sensors ◽  
2009 ◽  
Vol 9 (12) ◽  
pp. 9858-9872
Author(s):  
Krzysztof Ślot ◽  
Łukasz Bronakowski ◽  
Jaroslaw Cichosz ◽  
Hyongsuk Kim

Author(s):  
Wajdi Ghezaiel ◽  
Amel Ben Slimane ◽  
Ezzedine Ben Braiek

<p>Usable speech is a novel concept for processing co-channel speech data. It aims to extract minimally corrupted speech that is considered useful for various speech processing systems. This paper addresses co-channel speaker identification (SID). We employ a newly proposed usable speech extraction method based on pitch information obtained from a linear multi-scale decomposition by the discrete wavelet transform. The idea is to retain the speech segments in which only one pitch is detected and to remove the others. The detected usable speech is used as input to a speaker identification system. The system is evaluated on co-channel speech, and the results show a significant improvement in speaker identification across various Target to Interferer Ratios (TIR).</p>


Author(s):  
Mihir Narayan Mohanty ◽  
Aurobinda Routray ◽  
Prithviraj Kabisatpathy

Detecting voice in a speech signal is a challenging problem when developing high-performance systems for noisy environments. In this paper, we present an efficient algorithm for robust voiced speech detection and apply it to variable-rate speech coding. The key idea of the algorithm is to consider speech energy and zero-crossing rate (ZCR) information simultaneously when processing speech signals and finding the end point of the signal. This is followed by a decision rule and a background noise statistics estimator, both based on a statistical model. A robust decision rule is derived from the generalized likelihood ratio test (LRT) by assuming that the noise statistics are known a priori. The algorithm is most efficient for time-varying noise. According to our simulation results, the proposed algorithm shows significantly better performance at low signal-to-noise ratios and in noisy environments.
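The joint use of energy and ZCR can be sketched as follows. This is only a fixed-threshold illustration of the two features, not the LRT-based decision rule or the noise statistics estimator described in the abstract. The function names, the frame length, and both thresholds are assumptions chosen for the example.

```python
import math


def frame_features(frame):
    """Return (mean energy, zero-crossing rate) for one frame of samples."""
    energy = sum(s * s for s in frame) / len(frame)
    # Count sign changes between consecutive samples.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (len(frame) - 1)
    return energy, zcr


def detect_voiced(signal, frame_len=200, energy_thresh=0.01, zcr_thresh=0.25):
    """Flag a frame as voiced when energy is high AND ZCR is low.

    The thresholds are illustrative assumptions; a practical detector would
    adapt them to the estimated background noise.
    """
    flags = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        energy, zcr = frame_features(signal[start:start + frame_len])
        flags.append(energy > energy_thresh and zcr < zcr_thresh)
    return flags


if __name__ == "__main__":
    fs = 8000  # assumed sampling rate in Hz
    # Voiced-like frame: strong 100 Hz tone (high energy, few zero crossings).
    voiced = [0.5 * math.sin(2 * math.pi * 100 * n / fs) for n in range(200)]
    # Noise-like frame: weak, rapidly alternating samples (low energy, high ZCR).
    noise = [0.01 * (-1) ** n for n in range(200)]
    print(detect_voiced(voiced + noise))
```

Voiced speech tends to combine high energy with a low ZCR, while unvoiced sounds and background noise tend to show the opposite pattern, which is why the two features are tested together rather than separately.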

