Supervector Dimension Reduction for Efficient Speaker Age Estimation Based on the Acoustic Speech Signal

The information embedded in the visual dynamics of speech has the potential to improve the performance of speech and speaker recognition systems. The information carried in the visual speech signal complements the information in the acoustic speech signal, which is particularly beneficial in adverse acoustic environments. Non-invasive methods using low-cost sensors can be used to obtain acoustic and visual biometric signals, such as a person’s voice and lip movement, with little user cooperation. Unobtrusive biometric systems of this kind are needed to promote widespread adoption of biometric technology in today’s society. In this chapter, the authors describe the main components and theory of audio-visual and visual-only speech and speaker recognition systems. Audio-visual corpora are described and a number of speech and speaker recognition systems are reviewed. Finally, the authors discuss various open issues in system design and implementation and present future research and development directions in this area.

Manifestations of Task‐Induced Stress in the Acoustic Speech Signal

The Journal of the Acoustical Society of America ◽

10.1121/1.1911241 ◽

1968 ◽

Vol 44 (4) ◽

pp. 993-1001 ◽

Cited By ~ 52

Author(s):

Michael H. L. Hecker ◽

Kenneth N. Stevens ◽

Gottfried von Bismarck ◽

Carl E. Williams

Keyword(s):

Speech Signal ◽

Induced Stress ◽

Acoustic Speech Signal

Individual Identification Through Voice Using Mel-Frequency Cepstrum Coefficient (MFCC) and Hidden Markov Models (HMM) Method

Journal of Measurements Electronics Communications and Systems ◽

10.25124/jmecs.v7i1.3553 ◽

2020 ◽

Vol 7 (1) ◽

pp. 26

Author(s):

Dea Sifana Ramadhina ◽

Rita Magdalena ◽

Sofia Saidah

Keyword(s):

Hidden Markov Models ◽

Speaker Recognition ◽

Speech Signal ◽

Markov Models ◽

Hidden Markov ◽

Recognition System ◽

Individual Identification ◽

Acoustic Speech Signal ◽

Mel Frequency Cepstrum Coefficient ◽

The Voice

Voice is one of the parameters used to identify a person. From the voice, information such as gender, age, and even the identity of the speaker can be obtained. Speaker recognition is a method for reducing crimes and frauds committed by voice, and so minimizes the faking of one's identity. The Mel Frequency Cepstrum Coefficient (MFCC) method can be used in a speech recognition system: feature extraction with MFCC produces the acoustic features of the speech signal. For classification, Hidden Markov Models (HMM) are used to match an unidentified speaker’s voice with the voices in the database. In this research, the system is used to verify the speaker, namely 15 text-dependent in Indonesian. When tested on speakers that are also in the database, the highest accuracy is 99.16%.

Dimension Reduction Analysis of Vowel Signal Data Based on Manifold Learning

WSEAS Transactions on Advances in Engineering Education ◽

10.37394/232010.2021.18.13 ◽

2021 ◽

Vol 18 ◽

pp. 148-151

Author(s):

Jinqing Shen ◽

Zhongxiao Li ◽

Xiaodong Zhuang

Keyword(s):

Dimension Reduction ◽

Manifold Learning ◽

Speech Signal ◽

Reduction Method ◽

High Dimensional ◽

Speech Signal Processing ◽

Dimension Reduction Method ◽

Important Method ◽

Signal Features ◽

Reduction Methods

Data dimension reduction is an important method for overcoming the curse of dimensionality while retaining as much valuable information as possible. The speech signal is a non-stationary random signal with high redundancy, so suitable dimension reduction methods are needed to extract and analyze signal features efficiently in speech signal processing. Studies have shown that manifold structure exists in high-dimensional data. Manifold dimension reduction methods, which aim to discover the intrinsic geometric structure of the data, may be more effective in dealing with practical problems. This paper studies a data dimension reduction method based on manifold learning and applies it to the analysis of vowel signals.

Bimodal classification of English allophones employing acoustic speech signal and facial motion capture

The Journal of the Acoustical Society of America ◽

10.1121/1.5067951 ◽

2018 ◽

Vol 144 (3) ◽

pp. 1801-1802

Author(s):

Andrzej Czyzewski ◽

Szymon Zaporowski ◽

Bozena Kostek

Keyword(s):

Motion Capture ◽

Speech Signal ◽

Facial Motion ◽

Acoustic Speech Signal

Speech Production During Mechanical Ventilation in Tracheostomized Individuals

Journal of Speech Language and Hearing Research ◽

10.1044/jshr.3701.53 ◽

1994 ◽

Vol 37 (1) ◽

pp. 53-63 ◽

Cited By ~ 23

Author(s):

Jeannette D. Hoit ◽

Steven A. Shea ◽

Robert B. Banzett

Keyword(s):

Mechanical Ventilation ◽

Speech Production ◽

Chest Wall ◽

Muscle Activity ◽

Speech Signal ◽

Pressure Wave ◽

Neck Muscle ◽

Blood Gas ◽

Tracheal Pressure ◽

Acoustic Speech Signal

This investigation provides the first detailed description of speech production during mechanical ventilation. Seven adults with tracheostomies served as subjects. Recordings were made of chest wall motions, neck muscle activity, tracheal pressure, air flow at the nose and mouth, estimated blood-gas levels, and the acoustic speech signal during performance of a variety of speech tasks. Results indicated that subjects spoke for short durations that spanned all phases of the ventilator cycle, altered laryngeal opposing pressures in response to the continually changing tracheal pressure wave, and expended relatively small volumes of gas for speech production. Speech was improved by making selected ventilator adjustments. Suggestions for clinical interventions are offered.

Possibility to extract information on an acoustic speech signal from reflected laser radiation

2015 Days on Diffraction (DD) ◽

10.1109/dd.2015.7354841 ◽

2015 ◽

Author(s):

Larisa A. Glushchenko ◽

Alexander M. Korzun ◽

Victor I. Tupota ◽

Vadim Ya. Krohalev

Keyword(s):

Laser Radiation ◽

Speech Signal ◽

Acoustic Speech Signal ◽

Extract Information

A New Perspective on Developmental Language Problems: Perceptual Organization Deficits

Perspectives on Language Learning and Education ◽

10.1044/lle19.3.87 ◽

2012 ◽

Vol 19 (3) ◽

pp. 87-97 ◽

Cited By ~ 3

Author(s):

Susan Nittrouer

Keyword(s):

Perceptual Organization ◽

Auditory Processing ◽

Speech Signal ◽

Auditory Comprehension ◽

Sensory Inputs ◽

Language Problems ◽

Acoustic Speech Signal ◽

Processing Deficits ◽

New Perspective ◽

Experience Difficulty

Children with a variety of language-related problems, including dyslexia, experience difficulty processing the acoustic speech signal, leading to proposals of diagnostic entities known as auditory processing deficits. Although descriptions of these deficits vary across accounts, most hinge on the idea that problems arise at the level of detecting and/or discriminating sensory inputs. In this article, the author re-examines that idea and proposes that the difficulty more likely arises in how those sensations get organized into service for auditory comprehension of language.
