Supervector Dimension Reduction for Efficient Speaker Age Estimation Based on the Acoustic Speech Signal

2011 ◽  
Vol 19 (7) ◽  
pp. 1975-1985 ◽  
Author(s):  
Gil Dobry ◽  
Ron M. Hecht ◽  
Mireille Avigal ◽  
Yaniv Zigel
2009 ◽  
Author(s):  
Gil Dobry ◽  
Ron M. Hecht ◽  
Mireille Avigal ◽  
Yaniv Zigel

2009 ◽  
pp. 1-38 ◽  
Author(s):  
Derek J. Shiell ◽  
Louis H. Terry ◽  
Petar S. Aleksic ◽  
Aggelos K. Katsaggelos

The information embedded in the visual dynamics of speech has the potential to improve the performance of speech and speaker recognition systems. The information carried in the visual speech signal complements the information in the acoustic speech signal, which is particularly beneficial in adverse acoustic environments. Non-invasive methods using low-cost sensors can be used to obtain acoustic and visual biometric signals, such as a person’s voice and lip movement, with little user cooperation. These types of unobtrusive biometric systems are warranted to promote widespread adoption of biometric technology in today’s society. In this chapter, the authors describe the main components and theory of audio-visual and visual-only speech and speaker recognition systems. Audio-visual corpora are described and a number of speech and speaker recognition systems are reviewed. Finally, various open issues in system design and implementation are discussed, and future research and development directions in this area are presented.


1968 ◽  
Vol 44 (4) ◽  
pp. 993-1001 ◽  
Author(s):  
Michael H. L. Hecker ◽  
Kenneth N. Stevens ◽  
Gottfried von Bismarck ◽  
Carl E. Williams

Author(s):  
Dea Sifana Ramadhina ◽  
Rita Magdalena ◽  
Sofia Saidah

Voice is one of the parameters used to identify a person. From the voice, information such as the gender, age, and even the identity of the speaker can be obtained. Speaker recognition is a method for reducing crimes and frauds committed by voice, and so for minimizing identity faking. The Mel Frequency Cepstrum Coefficient (MFCC) method can be used in a speech recognition system. Feature extraction from the speech signal using MFCC produces the acoustic features of the speech signal. For classification, Hidden Markov Models (HMM) are used to match an unidentified speaker’s voice with the voices in the database. In this research, the system is used to verify the speaker, namely 15 text-dependent in Indonesian. When testing with the same speakers as in the database, the highest accuracy is 99.16%.


2021 ◽  
Vol 18 ◽  
pp. 148-151
Author(s):  
Jinqing Shen ◽  
Zhongxiao Li ◽  
Xiaodong Zhuang

Data dimension reduction is an important method for overcoming the curse of dimensionality while retaining as much valuable information as possible. The speech signal is a non-stationary random signal with high redundancy, so proper dimension reduction methods are needed to extract and analyze signal features efficiently in speech signal processing. Studies have shown that manifold structure exists in high-dimensional data. Manifold dimension reduction methods, which aim to discover the intrinsic geometric structure of the data, may be more effective in dealing with practical problems. This paper studies a data dimension reduction method based on manifold learning and applies it to the analysis of vowel signals.


1994 ◽  
Vol 37 (1) ◽  
pp. 53-63 ◽  
Author(s):  
Jeannette D. Hoit ◽  
Steven A. Shea ◽  
Robert B. Banzett

This investigation provides the first detailed description of speech production during mechanical ventilation. Seven adults with tracheostomies served as subjects. Recordings were made of chest wall motions, neck muscle activity, tracheal pressure, air flow at the nose and mouth, estimated blood-gas levels, and the acoustic speech signal during performance of a variety of speech tasks. Results indicated that subjects spoke for short durations that spanned all phases of the ventilator cycle, altered laryngeal opposing pressures in response to the continually changing tracheal pressure wave, and expended relatively small volumes of gas for speech production. Speech was improved by making selected ventilator adjustments. Suggestions for clinical interventions are offered.


Author(s):  
Larisa A. Glushchenko ◽  
Alexander M. Korzun ◽  
Victor I. Tupota ◽  
Vadim Ya. Krohalev

2012 ◽  
Vol 19 (3) ◽  
pp. 87-97 ◽  
Author(s):  
Susan Nittrouer

Children with a variety of language-related problems, including dyslexia, experience difficulty processing the acoustic speech signal, leading to proposals of diagnostic entities known as auditory processing deficits. Although descriptions of these deficits vary across accounts, most hinge on the idea that problems arise at the level of detecting and/or discriminating sensory inputs. In this article, the author re-examines that idea and proposes that the difficulty more likely arises in how those sensations get organized into service for auditory comprehension of language.

