A novel i-vector framework using multiple features and PCA for speaker recognition in short speech condition

Author(s):  
Chi Zhang ◽  
Xiaoqiang Li ◽  
Wei Li ◽  
Peizhong Lu ◽  
Wenqiang Zhang
1995 ◽  
Vol 38 (5) ◽  
pp. 1014-1024 ◽  
Author(s):  
Robert L. Whitehead ◽  
Nicholas Schiavetti ◽  
Brenda H. Whitehead ◽  
Dale Evan Metz

The purpose of this investigation was twofold: (a) to determine if there are changes in specific temporal characteristics of speech that occur during simultaneous communication, and (b) to determine if known temporal rules of spoken English are disrupted during simultaneous communication. Ten speakers uttered sentences consisting of a carrier phrase and experimental CVC words under conditions of: (a) speech, (b) speech combined with signed English, and (c) speech combined with signed English for every word except the CVC word that was fingerspelled. The temporal features investigated included: (a) sentence duration, (b) experimental CVC word duration, (c) vowel duration in experimental CVC words, (d) pause duration before and after experimental CVC words, and (e) consonantal effects on vowel duration. Results indicated that for all durational measures, the speech/sign/fingerspelling condition was longest, followed by the speech/sign condition, with the speech condition being shortest. It was also found that for all three speaking conditions, vowels were longer in duration when preceding voiced consonants than vowels preceding their voiceless cognates, and that a low vowel was longer in duration than a high vowel. These findings indicate that speakers consistently reduced their rate of speech when using simultaneous communication, but did not violate these specific temporal rules of English important for consonant and vowel perception.
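A minimal sketch of how durational measures like these could be aggregated and compared across the three speaking conditions; the table layout, column names, and values below are hypothetical placeholders, not the study's data.

```python
# Minimal sketch (not the authors' analysis): aggregating durational measures
# across the three speaking conditions. Columns and values are hypothetical.
import pandas as pd

# Each row: one utterance of an experimental CVC word by one speaker.
df = pd.DataFrame({
    "speaker": [1, 1, 1, 2, 2, 2],
    "condition": ["speech", "speech+sign", "speech+sign+fingerspelling"] * 2,
    "sentence_ms": [2100, 2600, 3100, 2050, 2550, 3200],
    "vowel_ms": [120, 150, 170, 115, 145, 175],
})

# Mean durations per condition; the abstract reports the ordering
# speech < speech/sign < speech/sign/fingerspelling for all measures.
print(df.groupby("condition")[["sentence_ms", "vowel_ms"]].mean())
```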


1998 ◽  
Vol 14 (3) ◽  
pp. 202-210 ◽  
Author(s):  
Suzanne Skiffington ◽  
Ephrem Fernandez ◽  
Ken McFarland

This study extends previous attempts to assess emotion with single adjective descriptors, by examining semantic as well as cognitive, motivational, and intensity features of emotions. The focus was on seven negative emotions common to several emotion typologies: anger, fear, sadness, shame, pity, jealousy, and contempt. For each of these emotions, seven items were generated corresponding to cognitive appraisal about the self, cognitive appraisal about the environment, action tendency, action fantasy, synonym, antonym, and intensity range of the emotion, respectively. A pilot study established that 48 of the 49 items were linked predominantly to the specific emotions as predicted. The main data set comprising 700 subjects' ratings of relatedness between items and emotions was subjected to a series of factor analyses, which revealed that 44 of the 49 items loaded on the emotion constructs as predicted. A final factor analysis of these items uncovered seven factors accounting for 39% of the variance. These emergent factors corresponded to the hypothesized emotion constructs, with the exception of anger and fear, which were somewhat confounded. These findings lay the groundwork for the construction of an instrument to assess emotions multicomponentially.
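As a rough illustration of the kind of analysis described, a seven-factor exploratory factor analysis over 49 item ratings can be sketched as follows; this uses scikit-learn's FactorAnalysis as a stand-in for the study's actual procedure, and the ratings are random placeholders rather than the real data.

```python
# Minimal sketch (not the study's exact procedure): a seven-factor exploratory
# factor analysis over 49 item ratings from 700 subjects. Data are placeholders.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
ratings = rng.normal(size=(700, 49))      # 700 subjects x 49 items (placeholder)

fa = FactorAnalysis(n_components=7, rotation="varimax", random_state=0)
scores = fa.fit_transform(ratings)        # subject scores on the 7 factors
loadings = fa.components_.T               # 49 x 7 item-factor loading matrix

# Inspect which items load most strongly on each factor.
for k in range(7):
    top_items = np.argsort(-np.abs(loadings[:, k]))[:3]
    print(f"factor {k}: strongest items {top_items}")
```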


2020 ◽  
Vol 64 (4) ◽  
pp. 40404-1-40404-16
Author(s):  
I.-J. Ding ◽  
C.-M. Ruan

Abstract With rapid developments in techniques related to the Internet of Things, smart service applications such as voice-command-based speech recognition and smart care applications such as context-aware emotion recognition will gain much attention and potentially become a requirement in smart home or office environments. In such intelligent applications, recognizing the identity of a specific member in an indoor space is a crucial issue. In this study, a combined audio-visual identity recognition approach was developed, in which visual information obtained from face detection was incorporated into acoustic Gaussian likelihood calculations for constructing speaker classification trees, significantly enhancing the Gaussian mixture model (GMM)-based speaker recognition method. The approach also considers the privacy of the monitored person and reduces the degree of surveillance. The popular Kinect sensor device, which contains a microphone array, was adopted to obtain acoustic voice data from the person. The proposed audio-visual identity recognition approach deploys only two cameras in a specific indoor space to conveniently perform face detection and quickly determine the total number of people in that space. The number of people in the indoor space obtained from face detection was then used to regulate the design of the GMM speaker classification tree. Two face-detection-regulated speaker classification tree schemes are presented for the GMM speaker recognition method in this study: the binary speaker classification tree (GMM-BT) and the non-binary speaker classification tree (GMM-NBT). The proposed GMM-BT and GMM-NBT methods achieve identity recognition rates of 84.28% and 83%, respectively; both values are higher than the rate of the conventional GMM approach (80.5%). Moreover, because the extremely complex calculations of face recognition required in general audio-visual speaker recognition tasks are not needed, the proposed approach is rapid and efficient, with only a slight increment of 0.051 s in the average recognition time.
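A minimal sketch of the general idea, not the paper's GMM-BT/GMM-NBT tree algorithms: per-speaker GMMs score a test utterance, and the face-detection head count narrows the candidate set before the likelihood comparison. All data, model settings, and the list of present speakers below are hypothetical.

```python
# Minimal sketch of the general idea (not the paper's GMM-BT/GMM-NBT trees):
# score a test utterance against per-speaker GMMs, but use the face-detection
# result to restrict the candidate set before comparing likelihoods.
# All data and parameters are hypothetical placeholders.
import numpy as np
from sklearn.mixture import GaussianMixture

def train_speaker_gmms(features_by_speaker, n_components=8):
    """Fit one GMM per enrolled speaker on acoustic feature frames."""
    gmms = {}
    for spk, feats in features_by_speaker.items():
        gmms[spk] = GaussianMixture(n_components=n_components,
                                    covariance_type="diag",
                                    random_state=0).fit(feats)
    return gmms

def identify(test_frames, gmms, present_speakers):
    """Pick the most likely speaker among those the cameras say are present."""
    candidates = present_speakers or list(gmms)   # fall back to all speakers
    scores = {spk: gmms[spk].score(test_frames) for spk in candidates}
    return max(scores, key=scores.get)

# Usage (placeholder data): 3 enrolled speakers, 2 detected in the room.
rng = np.random.default_rng(0)
enrolled = {f"spk{i}": rng.normal(i, 1.0, size=(200, 13)) for i in range(3)}
gmms = train_speaker_gmms(enrolled)
test = rng.normal(1, 1.0, size=(50, 13))          # frames from an unknown talker
print(identify(test, gmms, present_speakers=["spk0", "spk1"]))
```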


Author(s):  
A. Nagesh

The feature vectors of a speaker identification (SID) system play a crucial role in its overall performance. Many new feature extraction methods based on MFCC have been proposed, but the ultimate goal is to maximize the performance of the SID system. The objective of this paper is to derive a new set of feature vectors based on Gammatone Frequency Cepstral Coefficients (GFCC), modeled with a Gaussian Mixture Model (GMM), for speaker identification. MFCC are the default feature vectors for speaker recognition, but they are not very robust in the presence of additive noise. In recent studies, GFCC features have shown very good robustness against noise and acoustic change. The main idea is that GMM-based modeling of GFCC features improves overall speaker identification performance in low signal-to-noise ratio (SNR) conditions.
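A minimal sketch, under assumptions not stated in the abstract, of a GFCC-style front end (gammatone filterbank energies followed by a DCT) feeding per-speaker GMMs; the filter count, frame sizes, and number of cepstral coefficients are illustrative choices, and the audio is a random placeholder.

```python
# Minimal sketch (an approximation, not the paper's exact pipeline): compute
# GFCC-style features by passing a signal through a bank of gammatone filters,
# taking log energies per frame, applying a DCT, then fitting per-speaker GMMs.
import numpy as np
from scipy.signal import gammatone, lfilter
from scipy.fft import dct
from sklearn.mixture import GaussianMixture

def gfcc(signal, fs, n_filters=32, n_ceps=13, frame_len=400, hop=160):
    # Log-spaced centre frequencies between 50 Hz and just below the Nyquist rate.
    centres = np.geomspace(50, 0.9 * fs / 2, n_filters)
    outputs = []
    for fc in centres:
        b, a = gammatone(fc, "fir", fs=fs)       # 4th-order gammatone FIR filter
        outputs.append(lfilter(b, a, signal))
    outputs = np.stack(outputs)                  # (n_filters, n_samples)

    # Frame the filter outputs and take log energy per band and frame.
    n_frames = 1 + (outputs.shape[1] - frame_len) // hop
    feats = np.empty((n_frames, n_filters))
    for t in range(n_frames):
        seg = outputs[:, t * hop: t * hop + frame_len]
        feats[t] = np.log(np.mean(seg ** 2, axis=1) + 1e-10)

    # Decorrelate band energies with a DCT; keep the first n_ceps coefficients.
    return dct(feats, type=2, norm="ortho", axis=1)[:, :n_ceps]

# Usage (placeholder audio): fit one GMM per speaker on GFCC frames and label a
# test utterance with the speaker whose model gives the highest likelihood.
fs = 16000
rng = np.random.default_rng(0)
train = {spk: gfcc(rng.normal(size=fs * 3), fs) for spk in ("A", "B")}
models = {spk: GaussianMixture(8, covariance_type="diag", random_state=0).fit(f)
          for spk, f in train.items()}
test = gfcc(rng.normal(size=fs), fs)
print(max(models, key=lambda spk: models[spk].score(test)))
```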

